The JPEG standard is widely used in different image processing applications. One of the main components of the JPEG standard is the quantisation table (QT), since it plays a vital role in image properties such as image quality and file size. In recent years, several efforts based on population-based metaheuristic (PBMH) algorithms have been made to find the proper QT(s) for a specific image, although they do not take the user's opinion into account. Take, for example, an Android developer who prefers a small image size, while the optimisation process results in a high-quality image with a huge file size. Another pitfall of the current works is a lack of comprehensive coverage, meaning that the QT(s) cannot provide all possible combinations of file size and quality. Therefore, this paper proposes three distinct contributions. First, to include the user's opinion in the compression process, the file size of the output image can be controlled by the user in advance. Second, to tackle the lack of comprehensive coverage, we suggest a novel representation. Our proposed representation can not only provide more comprehensive coverage but also find the proper value of the quality factor for a specific image without any background knowledge. Both changes, in representation and objective function, are independent of the search strategy and can be used with any type of PBMH algorithm. Therefore, as the third contribution, we also provide a comprehensive benchmark of 22 state-of-the-art and recently introduced PBMH algorithms on our new formulation of JPEG image compression. Our extensive experiments on different benchmark images and in terms of different criteria show that our novel formulation for JPEG image compression works effectively.
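The user-controlled objective described above can be illustrated with a minimal sketch. This is not the paper's method: `toy_file_size` is a hypothetical surrogate standing in for actually encoding an image, and plain random search stands in for a full PBMH algorithm; only the idea of minimising the gap to a user-requested file size is taken from the abstract.

```python
import random

def toy_file_size(quality: int) -> float:
    """Hypothetical surrogate for JPEG file size in KB: grows with the
    quality factor. A real implementation would encode the image and
    measure the resulting bytes."""
    return 2.0 * quality + 40.0

def fitness(quality: int, target_kb: float) -> float:
    """User-aware objective: distance between produced and requested size."""
    return abs(toy_file_size(quality) - target_kb)

def random_search(target_kb: float, iters: int = 200, seed: int = 0) -> int:
    """Minimal stand-in for a PBMH: sample quality factors in [1, 100]
    and keep the one whose file size best matches the user's target."""
    rng = random.Random(seed)
    best_q, best_f = 1, fitness(1, target_kb)
    for _ in range(iters):
        q = rng.randint(1, 100)
        f = fitness(q, target_kb)
        if f < best_f:
            best_q, best_f = q, f
    return best_q
```

A real system would replace the surrogate with an actual JPEG encoder and the random sampler with the chosen population-based search.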
Sequence models in reinforcement learning require task knowledge to estimate the task policy. This paper presents a hierarchical algorithm for learning a sequence model from demonstrations. The high-level mechanism guides the low-level controller by selecting the subgoals for the latter to reach. This sequence replaces the returns used by previous methods, improving overall performance, especially in tasks with longer episodes and scarce rewards. We validate our method on multiple tasks from the OpenAI Gym, D4RL and Robomimic benchmarks. Our method outperforms eight baselines in eight out of eight tasks of varying horizon and reward frequency, without prior task knowledge, showing the advantages of a hierarchical approach to learning from demonstrations with sequence models.
Customer satisfaction is crucially affected by energy consumption in mobile devices, and one of the most energy-consuming parts of an application is its images. Although different images of different quality consume different amounts of energy, there is no straightforward way to calculate the energy consumed by operations on a typical image. First, this paper establishes that energy consumption correlates with both image quality and image file size, so these two can be considered proxies for energy consumption. We then propose a multi-objective strategy to enhance image quality and reduce image file size based on the quantisation table in JPEG image compression. To this end, we use two general families of multi-objective metaheuristics: scalarisation-based and Pareto-based. Scalarisation methods find a single optimal solution by combining the different objectives, while Pareto-based techniques aim to find a set of solutions. In this paper, we embed our strategy into five scalarisation algorithms: energy-aware multi-objective genetic algorithm (EnMOGA), energy-aware multi-objective particle swarm optimisation (EnMOPSO), energy-aware multi-objective differential evolution (EnMODE), energy-aware multi-objective evolution strategy (EnMOES), and energy-aware multi-objective pattern search (EnMOPS). In addition, two Pareto-based methods, the non-dominated sorting genetic algorithm (NSGA-II) and reference-point-based NSGA-II (NSGA-III), are used in the embedding scheme, yielding two Pareto-based algorithms, EnNSGAII and EnNSGAIII. The experimental study shows that embedding the proposed strategy into the metaheuristic algorithms improves the performance of the baseline algorithms.
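The two families of multi-objective approaches mentioned above can be sketched in a few lines. This is a generic illustration under the assumption that each candidate is a `(quality, size)` pair where quality is maximised and size minimised; it is not code from the paper, and the weight `w` is an illustrative parameter.

```python
def scalarise(quality: float, size: float, w: float = 0.5) -> float:
    """Weighted-sum scalarisation: collapse the two objectives
    (maximise quality, minimise file size) into one value to minimise."""
    return w * (-quality) + (1.0 - w) * size

def dominates(a, b) -> bool:
    """Pareto dominance for (quality, size) pairs: a dominates b if it is
    no worse in both objectives and strictly better in at least one."""
    q_a, s_a = a
    q_b, s_b = b
    no_worse = q_a >= q_b and s_a <= s_b
    strictly_better = q_a > q_b or s_a < s_b
    return no_worse and strictly_better

def pareto_front(points):
    """Keep only the non-dominated (quality, size) trade-offs."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]
```

A scalarisation-based metaheuristic minimises `scalarise` and returns one solution; a Pareto-based one maintains the set returned by `pareto_front` and lets the user pick a trade-off afterwards.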
If we give a robot the task of moving an object from its current position to another position in an unknown environment, the robot must explore the map, identify all types of obstacles, and then determine the best route to complete the task. We propose a mathematical model for finding an optimal path plan that avoids collisions with all static and moving obstacles while minimising completion time and travelled distance. In this model, bounding boxes around the obstacles and the robot are not considered, so the robot can move very close to the obstacles without colliding with them. We consider two types of obstacles: deterministic, which includes all static obstacles such as non-moving walls as well as moving obstacles whose motion follows a fixed pattern; and non-deterministic, which includes all obstacles whose motion can occur in any direction with some probability distribution at any time. We also take the robot's acceleration and deceleration into account to improve collision avoidance.
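As a much-simplified point of comparison for the model above, the static-obstacle part of the problem reduces to shortest-path search on a map. The sketch below is a plain BFS on a 4-connected grid; it ignores moving obstacles, probability distributions, and the robot's acceleration, all of which the proposed model handles.

```python
from collections import deque

def shortest_path_length(grid, start, goal):
    """BFS on a 4-connected grid with static obstacles (cells marked 1).
    Returns the minimum number of moves from start to goal, or -1 if the
    goal is unreachable. A deliberately simplified stand-in for the
    paper's full model."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1
```

Handling non-deterministic moving obstacles requires replanning or probabilistic reasoning on top of such a search, which is where the proposed model goes beyond this baseline.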
Remote sensing is the process of detecting and monitoring the physical characteristics of an area at a distance by measuring its reflected and emitted radiation. It is widely used to monitor ecosystems, mainly for their preservation. Ever-growing reports of invasive species are affecting the natural balance of ecosystems. Alien invasive species have a critical impact when introduced into new ecosystems and may lead to the extinction of native species. In this study, we focus on Ludwigia peploides, considered an aquatic invasive species by the European Union. Its presence can have negative impacts on the surrounding ecosystem and on human activities such as agriculture, fishing, and navigation. Our goal is to develop a method to identify the presence of the species. To achieve this, we used images collected by a drone-mounted multispectral sensor, creating our LudVision data set. To identify the targeted species in the collected images, we propose a new method for detecting Ludwigia p. in multispectral images. The method is based on existing state-of-the-art semantic segmentation methods, modified to handle multispectral data. The proposed method achieves a producer's accuracy of 0.799 and a user's accuracy of 0.955.
This paper presents a framework for learning visual representations from unlabeled video demonstrations captured from multiple viewpoints. We show that these representations are applicable for imitating several robotic tasks, including pick and place. We optimize a recently proposed self-supervised learning algorithm by applying contrastive learning to enhance task-relevant information while suppressing irrelevant information in the feature embeddings. We validate the proposed method on the publicly available Multi-View Pouring and a custom Pick and Place data sets and compare it with the TCN triplet baseline. We evaluate the learned representations using three metrics: viewpoint alignment, stage classification and reinforcement learning, and in all cases the results improve when compared to state-of-the-art approaches, with the added benefit of a reduced number of training iterations.
Object pose estimation has several important applications, such as robotic grasping and augmented reality. We present a method that estimates the 6D pose of objects, improving on the accuracy of current proposals while still being usable in real time. Our method takes RGB-D data as input to segment objects and estimate their poses. It uses a neural network with multiple heads: one head estimates the object class and generates the mask, a second estimates the values of the translation vector, and the final head estimates the values of the quaternion representing the object's rotation. These heads leverage a pyramid architecture used during feature extraction and feature fusion. Our method can be used in real time, with a low inference time of 0.12 seconds, and achieves high accuracy. With this combination of fast inference and good accuracy, our method can be used in robotic pick-and-place tasks and/or augmented reality applications.
A scheduling method with minimal makespan in a robot network cloud system is beneficial, since the system can complete all the tasks assigned to it in the fastest possible way. A robot network cloud system can be represented as a graph in which nodes are hardware with independent computing power and edges are data transmissions between nodes. Time-window constraints on tasks are a natural way of ordering them. The makespan is the maximum amount of time between when a node starts executing its first scheduled task and when all nodes have finished their last scheduled tasks. Load-balanced allocation and scheduling ensure that the gap between when the first node finishes its scheduled tasks and when all other nodes finish theirs is as short as possible. We propose a new load-balancing algorithm for task allocation and scheduling with minimal makespan. We theoretically prove the correctness of the proposed algorithm and present simulations illustrating the obtained results.
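To ground the makespan and load-balancing vocabulary used above, here is a sketch of the classical Longest-Processing-Time-first (LPT) greedy heuristic. It is not the paper's algorithm and ignores the graph's data-transmission edges and time windows; it simply assigns each task to the currently least-loaded node.

```python
import heapq

def lpt_schedule(task_times, n_nodes):
    """LPT greedy scheduling: process tasks in decreasing duration,
    always assigning to the least-loaded node. Returns (makespan,
    assignment) where assignment maps task index -> node index."""
    loads = [(0.0, node) for node in range(n_nodes)]
    heapq.heapify(loads)  # min-heap keyed on current node load
    assignment = {}
    for idx, t in sorted(enumerate(task_times), key=lambda x: -x[1]):
        load, node = heapq.heappop(loads)  # least-loaded node
        assignment[idx] = node
        heapq.heappush(loads, (load + t, node))
    makespan = max(load for load, _ in loads)
    return makespan, assignment
```

LPT is a well-known approximation for minimising makespan on identical machines; the paper's contribution is an algorithm that additionally accounts for the networked structure of the robot cloud system.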
Counterfactual explanation is a common class of methods for making local explanations of machine learning decisions. For a given instance, these methods aim to find the smallest modification of feature values that changes the decision predicted by a machine learning model. One of the challenges of counterfactual explanation is the efficient generation of realistic counterfactuals. To address this challenge, we propose VCNet (Variational Counter Net), a model architecture that combines a predictor and a counterfactual generator that are jointly trained, for regression or classification tasks. VCNet is able both to generate predictions and to generate counterfactual explanations without having to solve another minimisation problem. Our contribution is the generation of counterfactuals that are close to the distribution of the predicted class. This is done by learning a variational autoencoder conditioned on the output of the predictor in a joint-training fashion. We present an empirical evaluation on tabular datasets and across several interpretability metrics. The results are competitive with the state-of-the-art method.
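To make the "smallest modification that changes the decision" idea concrete, here is a toy baseline counterfactual search on a linear classifier. This is explicitly not VCNet (which trains a conditional VAE jointly with the predictor); it is the kind of per-instance minimisation that VCNet avoids having to solve.

```python
def predict(x, weights, bias):
    """Toy linear classifier: class 1 iff w.x + b > 0."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if score > 0 else 0

def greedy_counterfactual(x, weights, bias, step=0.1, max_iter=1000):
    """Baseline per-instance search: repeatedly nudge the most influential
    feature in the direction that lowers (or raises) the score until the
    predicted class flips, approximating the smallest decision-changing
    change. Returns the modified instance."""
    x = list(x)
    original = predict(x, weights, bias)
    for _ in range(max_iter):
        if predict(x, weights, bias) != original:
            return x
        i = max(range(len(x)), key=lambda j: abs(weights[j]))
        direction = -1 if original == 1 else 1
        x[i] += direction * step * (1 if weights[i] > 0 else -1)
    return x
```

Solving such a search for every query instance is what makes classical counterfactual methods expensive; VCNet instead generates in-distribution counterfactuals directly from its jointly trained generator.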
Compared to conventional bilingual translation systems, massively multilingual machine translation is appealing because a single model can translate into multiple languages and benefit from knowledge transfer for low-resource languages. On the other hand, massively multilingual models suffer from the curse of multilinguality unless their size is scaled up massively, which increases their training and inference costs. Sparse Mixture-of-Experts models are a way to drastically increase model capacity without requiring a proportional amount of computing. The recently released NLLB-200 is an example of such a model. It covers 202 languages but requires at least four 32GB GPUs just for inference. In this work, we propose a pruning method that allows the removal of up to 80\% of experts with a negligible loss in translation quality, which makes it feasible to run the model on a single 32GB GPU. Further analysis suggests that our pruning metrics allow us to identify language-specific experts and prune experts that are not relevant for a given language pair.
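The expert-pruning idea can be sketched with a frequency-based heuristic: route a sample of tokens for one language pair through the gating network, then keep only the most-used experts. This is a simplified stand-in for the paper's actual pruning metrics; the `gate_log` representation is an assumption for illustration.

```python
from collections import Counter

def prune_experts(gate_log, keep_fraction=0.2):
    """Keep only the most-frequently-routed experts for a language pair.
    `gate_log` is a list of expert ids chosen by the router on a sample
    of that pair's tokens; all other experts are pruned. Returns the set
    of expert ids to keep (at least one)."""
    counts = Counter(gate_log)
    n_keep = max(1, int(len(counts) * keep_fraction))
    return {expert_id for expert_id, _ in counts.most_common(n_keep)}
```

Pruning to roughly 20% of experts mirrors the paper's finding that up to 80% can be removed for a given language pair with negligible quality loss, since routing tends to concentrate on language-specific experts.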